MICE: Mining Idioms with Contextual Embeddings

Abstract

Idiomatic expressions can be problematic for natural language processing applications, as their meaning cannot be inferred from their constituting words. A lack of successful methodological approaches and sufficiently large datasets prevents the development of machine learning approaches for detecting idioms, especially for expressions that do not occur in the training set. We present an approach called MICE that uses contextual embeddings for that purpose. We present a new dataset of multi-word expressions with literal and idiomatic meanings and use it to train a classifier based on two state-of-the-art contextual word embeddings: ELMo and BERT. We show that deep neural networks using both embeddings perform much better than existing approaches and are capable of detecting idiomatic word use, even for expressions that were not present in the training set. We demonstrate the cross-lingual transfer of the developed models and analyze the size of the required dataset.

Similar resources

Mining Semantic Loop Idioms from Big Code

During maintenance, developers spend a lot of time transforming existing code: refactoring, optimizing, and adding checks to make it more robust. Much of this work is the drudgery of identifying and replacing specific patterns, yet it resists automation, because meaningful patterns are hard to automatically find. We present a technique for mining loop idioms, surprisingly probable semantic p...

Etymology, Contextual Pragmatic Clues, and Lexical Knowledge in L2 Idioms Learning

To investigate the effects of etymological elaboration, contextual pragmatic clues, and lexical knowledge on L2 idioms comprehension and production, 60 male intermediate level EFL students in three groups were selected. Each group was randomly assigned to one treatment condition. Group one participants were presented with the etymological explanation of idioms. In group two, the same idioms wer...

Mining User Contextual Preferences

User preferences play an important role in database query personalization since they can be used for sorting and selecting the objects which most fulfill the user wishes. In most situations user preferences are not static and may vary according to a multitude of user contexts. Automatic tools for extracting contextual preferences without bothering the user are desirable. In this paper, we propos...

Contextual Itemset Mining in DBpedia

In this paper we show the potential of contextual itemset mining in the context of Linked Open Data. Contextual itemset mining extracts frequent associations among items considering background information. In the case of Linked Open Data, the background information is represented by an Ontology defined over the data. Each resulting itemset is specific to a particular context and contexts can be...

Mining Adverse Drug Reaction Mentions in Twitter with Word Embeddings

This paper describes our system used in the PSB 2016 Workshop on Social Mining Shared Task for adverse drug reaction (ADR) extraction in Twitter. Our system uses Conditional Random Fields to train a classifier for extracting ADR mentions. We leverage word representations from a large amount of unlabeled tweets, both drug related and generic. Our experimental results show that cluster features deriv...

Journal

Journal title: Knowledge Based Systems

Year: 2022

ISSN: 1872-7409, 0950-7051

DOI: https://doi.org/10.1016/j.knosys.2021.107606